Tennis is a racket sport that can be played in singles or doubles.
Each player uses a stringed racquet to hit a hollow, felt-covered rubber ball over a net and into the opponent's court.
The earliest known version of tennis was a handball game played in France around the 12th century, called jeu de paume, or "game of the palm" (Castorino n.d.).
Tennis has three main governing bodies: the Association of Tennis Professionals (ATP), the Women’s Tennis Association (WTA) and the International Tennis Federation (ITF) (Score and Change n.d.).
Tennis is played on several court surfaces, including hard, clay, grass and carpet.
The characteristics of each surface have in turn led to many different types of ball, from low- to high-bouncing and from slow to fast.
There are many ball manufacturers; well-known ones include Dunlop, Wilson and Prince.
The four Grand Slam tournaments are the Australian Open in January, the French Open from late May to early June, Wimbledon from late June to early July, and the US Open from August to September.
The Australian Open and US Open take place on hard courts, whereas the French Open and Wimbledon take place on clay and grass courts respectively.
Tennis matches are divided into sets, with each set consisting of a number of games (USTA n.d.).
Most professional matches are played as best of three sets; at Grand Slams, women play best of three and men play best of five.
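To win a set, a player typically needs to win six games with a margin of at least two. Within each game, points progress from 0 to 15, 30 and 40.
If the score does not reach deuce (40-40), the player who wins the next point after reaching 40 wins that game. From deuce, a player must win two consecutive points to take the game.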
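The game-level scoring described above can be sketched as a small function. This is a minimal illustration, not part of the original analysis: the names `game_winner` and `score_label` are hypothetical, and the sketch assumes standard advantage scoring, in which a player needs at least four points and a two-point lead to win a game.

```python
from typing import Optional

# Conventional labels for 0, 1, 2 and 3 points won in a game.
POINT_LABELS = ["0", "15", "30", "40"]


def game_winner(points_a: int, points_b: int) -> Optional[str]:
    """Return "A" or "B" if that player has won the game, else None.

    Assumes standard advantage scoring: a player wins with at least
    four points and a lead of two or more.
    """
    if points_a >= 4 and points_a - points_b >= 2:
        return "A"
    if points_b >= 4 and points_b - points_a >= 2:
        return "B"
    return None


def score_label(points_a: int, points_b: int) -> str:
    """Describe the current state of a game in tennis terms."""
    winner = game_winner(points_a, points_b)
    if winner is not None:
        return f"Game {winner}"
    # Once both players have three or more points, the score is
    # either deuce or advantage to whoever is one point ahead.
    if points_a >= 3 and points_b >= 3:
        if points_a == points_b:
            return "Deuce"
        return "Advantage A" if points_a > points_b else "Advantage B"
    return f"{POINT_LABELS[points_a]}-{POINT_LABELS[points_b]}"
```

Here the arguments are the raw number of points each player has won in the current game, so `score_label(2, 1)` describes a game in which player A has won two points and player B one.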
Tennis Court, Racquet and Ball
Grand Slam Courts and Their Surface Types
The “WTA/ATP Tennis” data set used in this study was obtained from Kaggle (2021). It comprises player information, WTA and ATP statistics, and match results from 1949 to 2021.
Notably, the data was originally collected by Jeff Sackmann (2023), who should be credited by anyone who uses this data set in the future.
Note that the period from 1968 onwards is known as the Open Era, because it was decided that both professionals and amateurs could compete in Grand Slam events (Tennis Companion n.d.), removing the divisions that previously existed.
I would also like to clarify that plots 7 to 10 in this documentary are inspired by plots from this link.
However, I have made changes, such as using different years or variables and presenting the plots differently where I felt it worked better.
There are two files, namely KaggleMatches.csv and KagglePlayers.csv, containing the match and player information respectively.
The matches data set has 50 columns, while the players data set has 7 columns.
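As a quick sanity check on the two files, the column counts stated above can be verified by reading only the header row. This is a minimal sketch using the standard library, not the loading code used for the analysis: the helper names `column_names` and `check_column_count` are hypothetical, and it assumes the files are comma-separated, UTF-8 encoded and have a header row.

```python
import csv
from pathlib import Path
from typing import List


def column_names(csv_path: str) -> List[str]:
    """Return the header row of a CSV file as a list of column names.

    Assumes a comma-separated, UTF-8 encoded file with a header row.
    """
    with Path(csv_path).open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        # next() with a default avoids an error on an empty file.
        return next(reader, [])


def check_column_count(csv_path: str, expected: int) -> bool:
    """Check whether a CSV file has the expected number of columns."""
    return len(column_names(csv_path)) == expected
```

If both files sit in the working directory, `check_column_count("KaggleMatches.csv", 50)` and `check_column_count("KagglePlayers.csv", 7)` should return `True` when the files match the description above.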
The three main objectives of this documentary are:
In summary, this documentary has produced a number of important insights. The main points are:
The United States has the most tennis players represented, followed by the United Kingdom and Australia.
As the number of tennis players from a country increases, the winning and losing percentages of its players across all tournaments converge towards 50-50.
The average age of tennis players fluctuates over time. Since the start of the Open Era, the average age of female players has been consistently below that of male players.
Match duration is longest on clay courts and shortest on grass courts.
Serving performance differs little between left- and right-handed males, but is slightly better for left-handed females than for right-handed females.
Height does not influence male players’ winning percentage. However, there is a slight positive correlation between female height and winning percentage.
Serena Williams has won the most Grand Slam titles in the Open Era as of 2021.
The heatmap of winning percentages for the top players across each round of Grand Slams can provide information on the weaknesses and strengths of players across different court surfaces.
Even top performers have bogey players, opponents they consistently struggle against, and need to adapt their gameplay accordingly.
David Ferrer is one of the best players never to have won a Grand Slam.
For tennis enthusiasts, visual commentary and analysis is the icing on the cake. Fans can gain a deeper understanding of the game when they are given statistical insights and data-driven storytelling, hopefully like those provided in this documentary. Data analysis helps fans learn about the intricacies of player performance, tournament dynamics and the different characteristics that influence the game. It can also make the sport more approachable and interesting to viewers, increasing their engagement as key statistics and trends are presented in a simple storytelling manner.
At a higher level, coaches, players and other relevant parties can analyse player characteristics, performance metrics and match attributes to pinpoint areas for development and formulate winning strategies. Training schedules, strategies and game results can all benefit from more rigorous data analysis. Tennis analysis not only enriches the entertainment industry but also enhances player performance and drives advancement in the field.